Understanding HTML

This is the definition of HTML from W3C:

HTML is the language for describing the structure of Web pages. HTML gives authors the means to:

  • Publish online documents with headings, text, tables, lists, photos, etc.
  • Retrieve online information via hypertext links, at the click of a button.
  • Design forms for conducting transactions with remote services, for use in searching for information, making reservations, ordering products, etc.
  • Include spread-sheets, video clips, sound clips, and other applications directly in their documents.

With HTML, authors describe the structure of pages using markup. The elements of the language label pieces of content such as “paragraph,” “list,” “table,” and so on.

This is the definition of HTML from Wikipedia:

HyperText Markup Language, commonly abbreviated as HTML, is the standard markup language used to create web pages. Along with CSS, and JavaScript, HTML is a cornerstone technology used to create web pages, as well as to create user interfaces for mobile and web applications. Web browsers can read HTML files and render them into visible or audible web pages. HTML describes the structure of a website semantically and, before the advent of Cascading Style Sheets (CSS), included cues for the presentation or appearance of the document (web page), making it a markup language, rather than a programming language.

This is the definition of HTML from Mozilla:

HTML is a fairly simple language made up of elements, which can be applied to pieces of text to give them different meaning in a document (is it a paragraph? is it a bulleted list? is it part of a table?), structure a document into logical sections (does it have a header? three columns of content? a navigation menu?) and embed content such as images and videos into a page.

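HTML is a markup language. The content it marks up is displayed on web pages, so the marked-up text is the web's hypertext, hence the name HyperText Markup Language. Together with CSS and JavaScript, it is one of the three cornerstones of web page authoring.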

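Each of the three cornerstones has its own job, but the exact role of HTML is hard to pin down, and the three definitions above all differ from one another.
Still, HTML is a fixed thing. It does not change because someone describes it differently, so the three definitions should be different expressions of the same idea. The Wikipedia definition is closest to how I understand it: HTML defines the structure of content. That structure comes in two sizes. Large structure means the header, navigation bar, main content area, and other major blocks that are easy to tell apart, which is what the Mozilla definition calls "structure". Small structure means the paragraphs, lists, tables, and so on inside those blocks, which is what Mozilla calls "pieces of text" given "different meaning". But large structure carries meaning too, and, like small structure, it is defined by tags that have a specific meaning. So I think the most accurate statement is that HTML describes the structure of content semantically.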
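
As a rough illustration of the large/small distinction, here is a minimal page sketch. It assumes the HTML5 sectioning elements (header, nav, main); the page content, ids, and link targets are made up for the example.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example page</title>
</head>
<body>
  <!-- Large structure: the major blocks of the page -->
  <header>
    <h1>My Notes</h1>
  </header>
  <nav>
    <!-- Small structure: a list inside the navigation block -->
    <ul>
      <li><a href="#html">HTML</a></li>
      <li><a href="#css">CSS</a></li>
    </ul>
  </nav>
  <main>
    <!-- Small structure: headings, paragraphs, and a table -->
    <h2 id="html">HTML</h2>
    <p>HTML describes the structure of content.</p>
    <h2 id="css">CSS</h2>
    <p>CSS describes how that content is presented.</p>
    <table>
      <caption>The three cornerstones</caption>
      <tr><th>Technology</th><th>Role</th></tr>
      <tr><td>HTML</td><td>Structure</td></tr>
      <tr><td>CSS</td><td>Presentation</td></tr>
      <tr><td>JavaScript</td><td>Behavior</td></tr>
    </table>
  </main>
</body>
</html>
```

Every tag here, large or small, says what its content is, not how it should look.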

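I found a better definition in the book Learning Web Design: the way elements follow one another and nest inside one another is what naturally forms the structure.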

Definition from Learning Web Design, 4th Edition
In addition to adding meaning to content, the markup gives the document structure. The way elements follow each other or nest within one another creates relationships between the elements. You can think of it as an outline (its technical name is the DOM, for Document Object Model). The underlying document hierarchy is important because it gives browsers cues on how to handle the content. It is also the foundation upon which we add presentation instructions with style sheets and behaviors with JavaScript.

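Structure and semantics are usually defined together, by the same tag. The only difference is that some tags define large structure and others define small structure.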
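
To make the "follow and nest" idea concrete, here is a small fragment with the element hierarchy it produces written out in a comment. The content is invented, and the outline is a simplified sketch that lists elements only and leaves out text nodes.

```html
<article>
  <h2>Shopping</h2>
  <p>Things to buy <em>today</em>:</p>
  <ul>
    <li>Milk</li>
    <li>Bread</li>
  </ul>
</article>
<!--
Element hierarchy created by the nesting above:

article
  h2
  p
    em
  ul
    li
    li
-->
```

The h2, p, and ul are siblings because they follow each other inside article, while em and the li elements are children because they are nested. No extra markup is needed to declare these relationships.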

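Some HTML tags carry no semantics, such as <div> and <span>. They are best used only when no suitable semantic tag exists. The page can look exactly the same either way, but non-semantic tags are unfriendly both to search engine optimization and to the screen readers used by people with disabilities.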
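
A hedged side-by-side sketch of that point: the two fragments below could be styled to look the same, but only the second tells software what the content is. The class names and the /home URL are hypothetical.

```html
<!-- Non-semantic: can be styled to look like a menu, but nothing says it is navigation or a link -->
<div class="nav">
  <div class="nav-item">
    <span onclick="location.href='/home'">Home</span>
  </div>
</div>

<!-- Semantic: nav marks a navigation block, ul/li a list, a a real link -->
<nav>
  <ul>
    <li><a href="/home">Home</a></li>
  </ul>
</nav>
```

In the first version the span is not keyboard-focusable by default and is not exposed as a link to assistive technology, whereas the a element is. A search engine crawler can also follow the href in the second version.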

As we begin our tour of elements, I want to reiterate how important it is to choose elements semantically, that is, in a way that most accurately describes the content’s meaning. If you don’t like how it looks, change it with a style sheet. A semantically marked up document ensures your content is available and accessible in the widest range of browsing environments, from desktop computers and mobile devices to assistive screen readers. It also allows non-human readers, such as search engine indexing programs, to correctly parse your content and make decisions about the relative importance of elements on the page.
From Learning Web Design, 4th Edition